Learning Probabilistic Subcategorization Preference and its Application to Syntactic Disambiguation
Abstract
This paper proposes a novel method of learning probabilistic subcategorization preference. In the method, for the purpose of coping with the ambiguities of case dependencies and noun class generalization of argument/adjunct nouns, we introduce a data structure which represents a tuple of independent partial subcategorization frames. Each collocation of a verb and argument/adjunct nouns is assumed to be generated from one of the possible tuples of independent partial subcategorization frames. Parameters of subcategorization preference are then estimated so as to maximize the subcategorization preference function for each collocation of a verb and argument/adjunct nouns in the training corpus. We also describe the results of the experiments on learning probabilistic subcategorization preference from the EDR Japanese bracketed corpus, as well as those on evaluating the performance of subcategorization preference. This paper is an extended version of the paper presented at the Fifth Conference on Applied Natural Language Processing, 1997.
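The data structure described in the abstract can be illustrated with a small sketch. It is not the paper's formulation: the noun-class chains, the frame probabilities, the case markers, and the choice to score a tuple as the product of its frame probabilities are all assumptions made for illustration. The sketch enumerates every way to split the case slots of one verb-noun collocation into independent partial frames (the case dependency ambiguity) and every generalization level for each noun (the noun class ambiguity), then keeps the highest-scoring tuple.

```python
from itertools import product
from math import prod

# Hypothetical toy hierarchy: each noun lists itself and its ancestor classes.
CLASS_CHAIN = {
    "child": ["child", "human", "entity"],
    "juice": ["juice", "beverage", "entity"],
    "park": ["park", "place", "entity"],
}

# Hypothetical probabilities of partial frames: (verb, {(case, noun_class), ...}).
FRAME_PROB = {
    ("drink", frozenset({("ga", "human"), ("wo", "beverage")})): 0.2,
    ("drink", frozenset({("de", "place")})): 0.1,
}


def set_partitions(items):
    """Yield every partition of a list into non-empty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Either `first` forms a block of its own ...
        yield [[first]] + partition
        # ... or it joins one of the existing blocks.
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]


def candidate_tuples(verb, slots):
    """Enumerate tuples of partial frames that together cover all case slots."""
    cases = sorted(slots)
    for partition in set_partitions(cases):
        # Choose one generalization level per noun.
        for classes in product(*(CLASS_CHAIN[slots[c]] for c in cases)):
            chosen = dict(zip(cases, classes))
            yield tuple(
                (verb, frozenset((c, chosen[c]) for c in block))
                for block in partition
            )


def best_tuple(verb, slots, frame_prob):
    """Return the tuple with the highest product of frame probabilities."""
    def score(frames):
        # Assumed scoring rule: independent frames multiply; unseen frames get 0.
        return prod(frame_prob.get(frame, 0.0) for frame in frames)

    best = max(candidate_tuples(verb, slots), key=score)
    return best, score(best)


frames, value = best_tuple(
    "drink", {"ga": "child", "wo": "juice", "de": "park"}, FRAME_PROB
)
```

With these toy numbers, the selected tuple treats the `ga` and `wo` slots as one dependent partial frame and the `de` slot as a separate, independent one. In the paper the parameters are estimated from a training corpus rather than fixed by hand as they are here.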
Similar resources
Unsupervised Learning for Syntactic Disambiguation
We present a methodology framework for syntactic disambiguation in natural language texts. The method takes advantage of an existing manually compiled non-probabilistic and nonlexicalized grammar, and turns it into a probabilistic lexicalized grammar by automatically learning a kind of subcategorization frames or selectional preferences for all words observed in the training corpus. The diction...
Learning Probabilistic Subcategorization Preference by Identifying Case Dependencies and Optimal Noun Class Generalization Level
This paper proposes a novel method of learning probabilistic subcategorization preference. In the method, for the purpose of coping with the ambiguities of case dependencies and noun class generalization of argument/adjunct nouns, we introduce a data structure which represents a tuple of independent partial subcategorization frames. Each collocation of a verb and argument/adjunct nouns is assum...
Selectional preference acquisition through matrix factorization with missing data
Words in an utterance are not placed in their respective slots randomly from a uniform distribution. In English, for example, a verb will rarely, if ever, follow a determiner. This is a syntactic restriction. From another perspective, one would not expect to find a word such as defenestration as the object of eat. This is what is known as the selectional preference of a word for another word in...
Maximum Entropy Model Learning of Subcategorization Preference
This paper proposes a novel method for learning probabilistic models of subcategorization preference of verbs. Especially, we propose to consider the issues of case dependencies and noun class generalization in a uniform way. We adopt the maximum entropy model learning method and apply it to the task of model learning of subcategorization preference. Case dependencies and noun class ge...
Application of finite-state transducers to the acquisition of verb subcategorization information
This paper presents the design and implementation of a finite-state syntactic grammar of Basque that has been used with the objective of extracting information about verb subcategorization instances from newspaper texts. After a partial parser has built basic syntactic units such as noun phrases, prepositional phrases, and sentential complements, a finite-state parser performs syntactic disambi...
Publication date: 1997